

Search for: All records

Creators/Authors contains: "Li, Wentao"


  1. Nikolski, Macha (Ed.)
    Abstract Motivation: Genome-wide association studies (GWAS) benefit from the increasing availability of genomic data and cross-institution collaborations. However, sharing data across institutional boundaries jeopardizes medical data confidentiality and patient privacy. While modern cryptographic techniques provide formal security guarantees, the substantial communication and computational overheads hinder the practical application of large-scale collaborative GWAS. Results: This work introduces an efficient framework for conducting collaborative GWAS on distributed datasets, maintaining data privacy without compromising the accuracy of the results. We propose a novel two-step strategy aimed at reducing communication and computational overheads, and we employ iterative and sampling techniques to ensure accurate results. We instantiate our approach using logistic regression, a commonly used statistical method for identifying associations between genetic markers and the phenotype of interest. We evaluate our proposed methods using two real genomic datasets and demonstrate their robustness in the presence of between-study heterogeneity and skewed phenotype distributions using a variety of experimental settings. The empirical results show the efficiency and applicability of the proposed method and its promise for large-scale collaborative GWAS. Availability and implementation: The source code and data are available at https://github.com/amioamo/TDS.
  2. Abstract Electrophysiologic disturbances due to neurodegenerative disorders such as Alzheimer’s disease and Lewy Body disease are detectable by scalp EEG and can serve as a functional measure of disease severity. Traditional quantitative methods of EEG analysis often require an a-priori selection of clinically meaningful EEG features and are susceptible to bias, limiting the clinical utility of routine EEGs in the diagnosis and management of neurodegenerative disorders. We present a data-driven tensor decomposition approach to extract the top 6 spectral and spatial features representing commonly known sources of EEG activity during eyes-closed wakefulness. As part of their neurologic evaluation at Mayo Clinic, 11 001 patients underwent 12 176 routine, standard 10–20 scalp EEG studies. From these raw EEGs, we developed an algorithm based on posterior alpha activity and eye movement to automatically select awake-eyes-closed epochs and estimated average spectral power density (SPD) between 1 and 45 Hz for each channel. We then created a three-dimensional (3D) tensor (record × channel × frequency) and applied a canonical polyadic decomposition to extract the top six factors. We further identified an independent cohort of patients meeting consensus criteria for mild cognitive impairment (30) or dementia (39) due to Alzheimer’s disease and dementia with Lewy Bodies (31) and similarly aged cognitively normal controls (36). We evaluated the ability of the six factors in differentiating these subgroups using a Naïve Bayes classification approach and assessed for linear associations between factor loadings and Kokmen short test of mental status scores, fluorodeoxyglucose (FDG) PET uptake ratios and CSF Alzheimer’s Disease biomarker measures. Factors represented biologically meaningful brain activities including posterior alpha rhythm, anterior delta/theta rhythms and centroparietal beta, which correlated with patient age and EEG dysrhythmia grade. 
These factors were also able to distinguish patients from controls with a moderate to high degree of accuracy (Area Under the Curve (AUC) 0.59–0.91) and Alzheimer’s disease dementia from dementia with Lewy Bodies (AUC 0.61). Furthermore, relevant EEG features correlated with cognitive test performance, PET metabolism and CSF AB42 measures in the Alzheimer’s subgroup. This study demonstrates that data-driven approaches can extract biologically meaningful features from population-level clinical EEGs without artefact rejection or a-priori selection of channels or frequency bands. With continued development, such data-driven methods may improve the clinical utility of EEG in memory care by assisting in early identification of mild cognitive impairment and differentiating between different neurodegenerative causes of cognitive impairment. 
  3. Approximate confidence distribution computing (ACDC) offers a new take on the rapidly developing field of likelihood-free inference from within a frequentist framework. The appeal of this computational method for statistical inference hinges upon the concept of a confidence distribution, a special type of estimator which is defined with respect to the repeated sampling principle. An ACDC method provides frequentist validation for computational inference in problems with unknown or intractable likelihoods. The main theoretical contribution of this work is the identification of a matching condition necessary for frequentist validity of inference from this method. In addition to providing an example of how a modern understanding of confidence distribution theory can be used to connect Bayesian and frequentist inferential paradigms, we present a case to expand the current scope of so-called approximate Bayesian inference to include non-Bayesian inference by targeting a confidence distribution rather than a posterior. The main practical contribution of this work is the development of a data-driven approach to drive ACDC in either Bayesian or frequentist contexts. The ACDC algorithm is data-driven by the selection of a data-dependent proposal function, the structure of which is quite general and adaptable to many settings. We explore three numerical examples that both verify the theoretical arguments in the development of ACDC and suggest instances in which ACDC outperforms approximate Bayesian computing methods computationally.
  4. A rapid and sensitive method is described for measuring perchlorate (ClO₄⁻), chlorate (ClO₃⁻), chlorite (ClO₂⁻), bromate (BrO₃⁻), and iodate (IO₃⁻) ions in natural and treated waters using non-suppressed ion chromatography with electrospray ionization and tandem mass spectrometry (NS-IC-MS/MS). Major benefits of the NS-IC-MS/MS method include a short analysis time (12 minutes), low limits of quantification for BrO₃⁻ (0.10 μg L⁻¹), ClO₄⁻ (0.06 μg L⁻¹), ClO₃⁻ (0.80 μg L⁻¹), and ClO₂⁻ (0.40 μg L⁻¹), and compatibility with conventional LC-MS/MS instrumentation. Chromatographic separations were generally performed under isocratic conditions with a Thermo Scientific Dionex AS16 column, using a mobile phase of 20% 1 M aqueous methylamine and 80% acetonitrile. The isocratic method can also be optimized for IO₃⁻ analysis by including a gradient from the isocratic mobile phase to 100% 1 M aqueous methylamine. Four common anions (Cl⁻, Br⁻, SO₄²⁻, and HCO₃⁻/CO₃²⁻), a natural organic matter isolate (Suwannee River NOM), and several real water samples were tested to examine influences of natural water constituents on oxyhalide detection. Only ClO₂⁻ quantification was significantly affected, by elevated chloride concentrations (>2 mM) and NOM. The method was successfully applied to quantify oxyhalides in natural waters, chlorinated tap water, and waters subjected to advanced oxidation by sunlight-driven photolysis of free available chlorine (sunlight/FAC). Sunlight/FAC treatment of NOM-free waters containing 200 μg L⁻¹ Br⁻ resulted in formation of up to 263 ± 35 μg L⁻¹ and 764 ± 54 μg L⁻¹ ClO₃⁻, and up to 20.1 ± 1.0 μg L⁻¹ and 33.8 ± 1.0 μg L⁻¹ BrO₃⁻ (at pH 6 and 8, respectively). NOM strongly inhibited ClO₃⁻ and BrO₃⁻ formation, likely by scavenging reactive oxygen or halogen species.
As prior work shows that the greatest benefits in applying the sunlight/FAC process for purposes of improving disinfection of chlorine-resistant microorganisms are realized in waters with lower DOC levels and higher pH, it may therefore be desirable to limit potential applications to waters containing moderate DOC concentrations (e.g., ∼1–2 mg C L⁻¹), low Br⁻ concentrations (e.g., <50 μg L⁻¹), and circumneutral to moderately alkaline pH (e.g., pH 7–8) to strike a balance between maximizing microbial inactivation and minimizing formation of oxyhalides and other disinfection byproducts.
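Entry 3 above compares ACDC against approximate Bayesian computing (ABC). As a rough illustration of that baseline only, the following is a minimal sketch of plain rejection ABC, not the ACDC algorithm from the abstract. The toy model (a unit-variance normal with unknown mean), the uniform prior, the sample-mean summary statistic, the tolerance, and the function name `abc_rejection` are all assumptions chosen for illustration.

```python
import numpy as np


def abc_rejection(observed, prior_low, prior_high, n_draws, tol, rng):
    """Rejection ABC for the mean of a unit-variance normal model (toy example)."""
    n = observed.size
    obs_summary = observed.mean()  # summary statistic of the observed data

    # 1. Draw candidate parameters from a uniform prior (an assumed choice).
    thetas = rng.uniform(prior_low, prior_high, size=n_draws)

    # 2. Simulate one dataset of size n per candidate and summarize each.
    sims = rng.normal(loc=thetas[:, None], scale=1.0, size=(n_draws, n))
    sim_summary = sims.mean(axis=1)

    # 3. Keep candidates whose simulated summary lands within tol of the observed one.
    keep = np.abs(sim_summary - obs_summary) < tol
    return thetas[keep]


rng = np.random.default_rng(0)
observed = rng.normal(loc=2.0, scale=1.0, size=50)  # synthetic "observed" data
accepted = abc_rejection(observed, -5.0, 5.0, 20_000, 0.1, rng)
print(f"accepted {accepted.size} draws, mean {accepted.mean():.3f}")
```

The accepted draws approximate the posterior without ever evaluating a likelihood. Because the candidates come from a fixed prior, most of them are rejected. The abstract's data-dependent proposal function presumably aims to reduce this kind of waste, though its actual form is not given in the listing.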